Abstract. We give a new randomized LP-rounding 1.725-approximation 
algorithm for the metric Fault-Tolerant Uncapacitated Facility Location 
problem. This improves on the previously best known 2.076-approximation 
algorithm of Swamy & Shmoys. To the best of our knowledge, our work 
provides the first application of a dependent-rounding technique in the 
domain of facility location. The analysis of our algorithm benefits from, 
and extends, methods developed for Uncapacitated Facility Location; it 
also helps uncover new properties of the dependent-rounding approach. 
An important concept that we develop is a novel, hierarchical clustering 
scheme. Typically, LP-rounding approximation algorithms for facility location 
problems are based on partitioning facilities into disjoint clusters 
and opening at least one facility in each cluster. We extend this approach 
and construct a laminar family of clusters, which then guides the rounding 
procedure. This allows us to exploit properties of dependent rounding, and 
yields a fairly tight analysis resulting in the improved approximation 
ratio. 



1 Introduction 

In Facility Location problems we are given a set of clients $\mathcal{C}$ that require a certain 
service. To provide such a service, we need to open a subset of a given set of 
facilities $\mathcal{F}$. Opening each facility $i \in \mathcal{F}$ costs $f_i$, and serving a client $j$ by facility 
$i$ costs $c_{ij}$; the standard assumption is that the $c_{ij}$ are symmetric and constitute 
a metric. (The non-metric case is much harder to approximate.) In this paper, 
we follow Swamy & Shmoys [10] and study the Fault-Tolerant Facility Location 
(FTFL) problem, where each client has a positive integer specified as its coverage 
requirement $r_j$. The task is to find a minimum-cost solution which opens some 
facilities from $\mathcal{F}$ and connects each client $j$ to $r_j$ different open facilities. 

The FTFL problem was introduced by Jain & Vazirani [6]. Guha et al. [5] 
gave the first constant-factor approximation algorithm, with approximation ratio 
2.408. This was later improved by Swamy & Shmoys [10], who gave a 2.076-approximation 
algorithm. FTFL generalizes the standard Uncapacitated Facility 
Location (UFL) problem, wherein $r_j = 1$ for all $j$, for which Guha & Khuller [4] 
proved an approximation lower bound of $\approx 1.463$. The current-best approximation 
ratio for UFL is achieved by the 1.5-approximation algorithm of Byrka [2]. 

In this paper we give a new LP-rounding 1.7245-approximation algorithm 
for the FTFL problem. It is the first application of the dependent rounding 
technique of [9] to a facility location problem. 

Our algorithm uses a novel clustering method, which allows clusters not to 
be disjoint, but rather to form a laminar family of subsets of facilities. The 
hierarchical structure of the obtained clustering exploits properties of dependent 
rounding. By first rounding the "facility-opening" variables within smaller clusters, 
we are able to ensure that at least a certain number of facilities is open 
in each of the clusters. Intuitively, by allowing clusters to have different sizes 
we may, in a more efficient manner, guarantee the opening of sufficiently many 
facilities around clients with different coverage requirements $r_j$. In addition, one 
of our main technical contributions is Theorem 2, which develops a new property 
of the dependent-rounding technique that appears likely to have further applications. 
Basically, suppose we apply dependent rounding to a sequence of reals 
and consider an arbitrary subset $S$ of the rounded variables (each of which lies 
in $\{0, 1\}$) as well as an arbitrary integer $k > 0$. Then, a natural fault-tolerance-related 
objective is that if $X$ denotes the number of variables rounded to 1 in $S$, 
then the random variable $Z = \min\{k, X\}$ be "large". (In other words, we want 
$X$ to be "large", but $X$ being more than $k$ does not add any marginal utility.) We 
prove that if $X_0$ denotes the corresponding sum wherein the reals are rounded 
independently and if $Z_0 = \min\{k, X_0\}$, then $\mathbf{E}[Z] \ge \mathbf{E}[Z_0]$. Thus, for analysis 
purposes, we may work with $Z_0$, which is much more tractable due to the independence; 
at the same time, we derive all the benefits of dependent rounding 
(such as a given number of facilities becoming available in a cluster, with probability 
one). Given the growing number of applications of dependent-rounding 
methodologies, we view this as a useful addition to the toolkit. 

2 Dependent rounding 

Given a fractional vector $y = (y_1, y_2, \ldots, y_N) \in [0, 1]^N$ we often seek to round 
it to an integral vector $\hat{y} \in \{0, 1\}^N$ that is in a problem-specific sense very 
"close to" $y$. The dependent-randomized-rounding technique of [9] is one such 
approach known for preserving the sum of the entries deterministically, along 
with concentration bounds for any linear combination of the entries; we will 
generalize a known property of this technique in order to apply it to the FTFL 
problem. The very useful pipage rounding technique of [1] was developed prior 
to [9], and can be viewed as a derandomization (deterministic analog) of [9] via 
the method of conditional probabilities. Indeed, the results of [1] were applied in 
the work of [10]; the probabilistic intuition, as well as our generalization of the 
analysis of [9], helps obtain our results. 

Define $[t] = \{1, 2, \ldots, t\}$. Given a fractional vector $y = (y_1, y_2, \ldots, y_N) \in 
[0, 1]^N$, the rounding technique of [9] (henceforth just referred to as "dependent 
rounding") is a polynomial-time randomized algorithm to produce a random 
vector $\hat{y} \in \{0, 1\}^N$ with the following three properties: 

(P1): marginals. $\forall i$, $\Pr[\hat{y}_i = 1] = y_i$; 

(P2): sum-preservation. With probability one, $\sum_{i=1}^{N} \hat{y}_i$ equals either $\lfloor \sum_{i=1}^{N} y_i \rfloor$ 
or $\lceil \sum_{i=1}^{N} y_i \rceil$; and 

(P3): negative correlation. $\forall S \subseteq [N]$, $\Pr[\bigwedge_{i \in S} (\hat{y}_i = 0)] \le \prod_{i \in S} (1 - y_i)$, and 
$\Pr[\bigwedge_{i \in S} (\hat{y}_i = 1)] \le \prod_{i \in S} y_i$. 

The dependent-rounding algorithm is described in Appendix A. In this paper, 
we also exploit the order in which the entries of the given fractional vector $y$ 
are rounded. We initially define a laminar family $\mathcal{S} \subseteq 2^{[N]}$ of subsets of indices. 
When applying the dependent rounding procedure, we first round within the 
smaller sets, until at most one fractional entry in a set is left; then we proceed 
with bigger sets, possibly containing the already rounded entries. It can easily 
be shown that this ensures the following version of property (P2) for all subsets $S$ 
from the laminar family $\mathcal{S}$: 

(P2'): sum-preservation. With probability one, $\sum_{i \in S} \hat{y}_i = \sum_{i \in S} y_i$ 
and $|\{i \in S : \hat{y}_i = 1\}| = \lfloor \sum_{i \in S} y_i \rfloor$. 

Now, let $S \subseteq [N]$ be any subset, not necessarily from $\mathcal{S}$. In order to present 
our results, we need two functions, $\mathrm{Sum}_S$ and $g_{\lambda,S}$. For any vector $x \in [0, 1]^n$, 
let $\mathrm{Sum}_S(x) = \sum_{i \in S} x_i$ be the sum of the elements of $x$ indexed by elements of 
$S$; in particular, if $x$ is a (possibly random) vector with all entries either 0 or 1, 
then $\mathrm{Sum}_S(x)$ counts the number of entries in $S$ that are 1. Next, given $s = |S|$ 
and a real vector $\lambda = (\lambda_0, \lambda_1, \lambda_2, \ldots, \lambda_s)$, we define, for any $x \in \{0, 1\}^n$, 

$$g_{\lambda,S}(x) = \sum_{i=0}^{s} \lambda_i \cdot \mathcal{I}(\mathrm{Sum}_S(x) = i),$$

where $\mathcal{I}(\cdot)$ denotes the indicator function. Thus, $g_{\lambda,S}(x) = \lambda_i$ if $\mathrm{Sum}_S(x) = i$. 

Let $\mathcal{R}(y)$ be a random vector in $\{0, 1\}^N$ obtained by independently rounding 
each $y_i$ to 1 with probability $y_i$, and to 0 with the complementary probability 
of $1 - y_i$. Suppose, as above, that $\hat{y}$ is a random vector in $\{0, 1\}^N$ obtained 
by applying the dependent rounding technique to $y$. We start with a general 
theorem and then specialize it to Theorem 2, which will be very useful for us: 

Theorem 1. Suppose we conduct dependent rounding on $y = (y_1, y_2, \ldots, y_N)$. 
Let $S \subseteq [N]$ be any subset with cardinality $s \ge 2$, and let $\lambda = (\lambda_0, \lambda_1, \lambda_2, \ldots, \lambda_s)$ 
be any vector such that for all $r$ with $0 \le r \le s - 2$ we have $\lambda_r - 2\lambda_{r+1} + \lambda_{r+2} \le 0$. 
Then, $\mathbf{E}[g_{\lambda,S}(\hat{y})] \ge \mathbf{E}[g_{\lambda,S}(\mathcal{R}(y))]$. 



Theorem 2. For any $y \in [0, 1]^N$, $S \subseteq [N]$, and $k = 1, 2, \ldots$, we have 
$\mathbf{E}[\min\{k, \mathrm{Sum}_S(\hat{y})\}] \ge \mathbf{E}[\min\{k, \mathrm{Sum}_S(\mathcal{R}(y))\}]$. 
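
To see how Theorem 2 can arise as a special case of Theorem 1, note that the weights $\lambda_i = \min\{k, i\}$ turn $g_{\lambda,S}$ into the truncated count $\min\{k, \mathrm{Sum}_S(x)\}$ and satisfy the concavity condition of Theorem 1. The following Python sketch illustrates this; the helper names (`sum_S`, `g`, `truncation_weights`, `is_concave`) are illustrative assumptions and indices are 0-based here, unlike the 1-based $[N]$ of the text.

```python
from typing import Sequence, Set


def sum_S(x: Sequence[float], S: Set[int]) -> float:
    """Sum_S(x): sum of the entries of x indexed by S (0-based indices)."""
    return sum(x[i] for i in S)


def g(lam: Sequence[float], S: Set[int], x: Sequence[int]) -> float:
    """g_{lambda,S}(x) = lambda_i, where i = Sum_S(x), for a 0/1 vector x."""
    assert len(lam) == len(S) + 1
    return lam[int(sum_S(x, S))]


def truncation_weights(k: int, s: int):
    """lambda_i = min(k, i) for i = 0..s, so that g equals min{k, Sum_S(x)}."""
    return [min(k, i) for i in range(s + 1)]


def is_concave(lam: Sequence[float]) -> bool:
    """Check lambda_r - 2*lambda_{r+1} + lambda_{r+2} <= 0 for 0 <= r <= s-2."""
    return all(lam[r] - 2 * lam[r + 1] + lam[r + 2] <= 0
               for r in range(len(lam) - 2))


# Hypothetical example: S has three indices and the coverage cap is k = 2.
S = {0, 2, 3}
lam = truncation_weights(2, len(S))
value = g(lam, S, [1, 0, 1, 1, 1])   # three ones inside S, capped at k = 2
```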
Using the notation $\exp(t) = e^t$, our next key result is: 

Theorem 3. For any $y \in [0, 1]^N$, $S \subseteq [N]$, and $k = 1, 2, \ldots$, we have 
$\mathbf{E}[\min\{k, \mathrm{Sum}_S(\mathcal{R}(y))\}] \ge k \cdot (1 - \exp(-\mathrm{Sum}_S(y)/k))$. 

The above two theorems yield a key corollary that we will use: 

Corollary 1. $\mathbf{E}[\min\{k, \mathrm{Sum}_S(\hat{y})\}] \ge k \cdot (1 - \exp(-\mathrm{Sum}_S(y)/k))$. 

Proofs of the theorems from this section are provided in Appendix B. 
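
The bound of Theorem 3 can be checked numerically on small inputs. The sketch below computes the exact distribution of the number of ones under independent rounding with a simple dynamic programme and compares the expected truncated sum with $k (1 - \exp(-\mathrm{Sum}_S(y)/k))$. The function names and the sample vector are assumptions made for illustration, and `probs` is taken to hold only the entries $y_i$ with $i \in S$.

```python
import math


def independent_sum_distribution(probs):
    """dist[c] = Pr[exactly c of the independent Bernoulli(p) trials equal 1]."""
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for c, mass in enumerate(dist):
            nxt[c] += mass * (1.0 - p)      # this entry is rounded to 0
            nxt[c + 1] += mass * p          # this entry is rounded to 1
        dist = nxt
    return dist


def expected_truncated_sum(probs, k):
    """E[min{k, Sum_S(R(y))}] for independent rounding of the given entries."""
    dist = independent_sum_distribution(probs)
    return sum(min(k, c) * mass for c, mass in enumerate(dist))


def theorem3_bound(probs, k):
    """Right-hand side of Theorem 3: k * (1 - exp(-Sum_S(y) / k))."""
    return k * (1.0 - math.exp(-sum(probs) / k))


# Hypothetical fractional values y_i for the indices in S.
probs = [0.5, 0.5, 0.7, 0.3]
gaps = {k: expected_truncated_sum(probs, k) - theorem3_bound(probs, k)
        for k in (1, 2, 3)}
```

By Theorem 2, the same lower bound then carries over to the dependently rounded vector, which is the content of Corollary 1.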

3 Algorithm 
3.1 LP-relaxation 

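The FTFL problem is defined by the following Integer Program (IP). 

$$\text{minimize} \quad \sum_{i \in \mathcal{F}} f_i y_i + \sum_{j \in \mathcal{C}} \sum_{i \in \mathcal{F}} c_{ij} x_{ij} \qquad (1)$$

$$\text{subject to:} \quad \sum_{i \in \mathcal{F}} x_{ij} \ge r_j \quad \forall j \in \mathcal{C} \qquad (2)$$

$$x_{ij} \le y_i \quad \forall j \in \mathcal{C} \; \forall i \in \mathcal{F} \qquad (3)$$

$$y_i \le 1 \quad \forall i \in \mathcal{F} \qquad (4)$$

$$x_{ij}, y_i \in \mathbb{Z}_{\ge 0} \quad \forall j \in \mathcal{C} \; \forall i \in \mathcal{F}, \qquad (5)$$

where $\mathcal{C}$ is the set of clients, $\mathcal{F}$ is the set of possible locations of facilities, $f_i$ is 
the cost of opening a facility at location $i$, $c_{ij}$ is the cost of serving client $j$ from 
a facility at location $i$, and $r_j$ is the number of facilities client $j$ needs to be 
connected to. 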

If we relax constraint (5) to $x_{ij}, y_i \ge 0$ we obtain the standard LP-relaxation 
of the problem. Let $(x^*, y^*)$ be an optimal solution to this LP relaxation. We will 
give an algorithm that rounds this solution to an integral solution $(\tilde{x}, \tilde{y})$ with 
cost at most $\gamma \approx 1.7245$ times the cost of $(x^*, y^*)$. 
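
As a concrete reading of (1)-(5), the following sketch evaluates the objective (1) and checks constraints (2)-(4) together with the relaxed non-negativity version of (5) for a candidate fractional solution. The data layout (`c[i][j]`, `x[i][j]` indexed facility-first) and the function names are assumptions made for this illustration; no LP solver is invoked.

```python
def ftfl_cost(f, c, x, y):
    """Objective (1): facility opening cost plus connection cost."""
    n_fac = len(f)
    opening = sum(f[i] * y[i] for i in range(n_fac))
    connection = sum(c[i][j] * x[i][j]
                     for i in range(n_fac) for j in range(len(c[i])))
    return opening + connection


def is_lp_feasible(r, x, y, eps=1e-9):
    """Check (2)-(4) and the relaxed constraint x_ij, y_i >= 0."""
    n_fac, n_cli = len(y), len(r)
    for j in range(n_cli):
        if sum(x[i][j] for i in range(n_fac)) < r[j] - eps:      # (2)
            return False
    for i in range(n_fac):
        if y[i] > 1 + eps or y[i] < -eps:                        # (4), y_i >= 0
            return False
        for j in range(n_cli):
            if x[i][j] > y[i] + eps or x[i][j] < -eps:           # (3), x_ij >= 0
                return False
    return True


# Hypothetical instance: two facilities, one client with requirement 1.
f = [3.0, 4.0]
c = [[1.0], [2.0]]
r = [1]
x = [[0.5], [0.5]]
y = [0.5, 0.5]
cost = ftfl_cost(f, c, x, y)
```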



3.2 Scaling 

We may assume, without loss of generality, that for any client $j \in \mathcal{C}$ there exists 
at most one facility $i \in \mathcal{F}$ such that $0 < x^*_{ij} < y^*_i$. Moreover, this facility may 
be assumed to have the highest distance to client $j$ among the facilities that 
fractionally serve $j$ in $(x^*, y^*)$. 

We first set $\tilde{x}_{ij} = \tilde{y}_i = 0$ for all $i \in \mathcal{F}$, $j \in \mathcal{C}$. Then we scale up the fractional 
solution by the constant $\gamma \approx 1.7245$ to obtain a fractional solution $(\hat{x}, \hat{y})$. To be 
precise: we set $\hat{x}_{ij} = \min\{1, \gamma \cdot x^*_{ij}\}$, $\hat{y}_i = \min\{1, \gamma \cdot y^*_i\}$. We open each facility 
$i$ with $\hat{y}_i = 1$ and connect each client-facility pair with $\hat{x}_{ij} = 1$. To be more 
precise, we modify $\hat{y}$, $\tilde{y}$, $\hat{x}$, $\tilde{x}$ and the service requirements $r$ as follows. For each 
facility $i$ with $\hat{y}_i = 1$, set $\hat{y}_i = 0$ and $\tilde{y}_i = 1$. Then, for every pair $(i, j)$ such 
that $\hat{x}_{ij} = 1$, set $\hat{x}_{ij} = 0$, $\tilde{x}_{ij} = 1$ and decrease $r_j$ by one. When this process is 
finished we call the resulting $r$, $\hat{y}$ and $\hat{x}$ by $\bar{r}$, $\bar{y}$ and $\bar{x}$. Note that the connections 
that we made in this phase may be paid for by the difference in the connection 
cost between $\hat{x}$ and $\bar{x}$. We will show that the remaining connection cost of the 
solution of the algorithm is expected to be at most the cost of $\bar{x}$. 
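
The scaling phase described above can be sketched as follows. This is a minimal illustration under assumed conventions (matrices indexed as `x[i][j]` with facility index first, a small tolerance `tol` for detecting entries that reach 1, and the function name `scale_solution`); it is not taken from the paper.

```python
def scale_solution(x_star, y_star, r, gamma=1.7245, tol=1e-12):
    """Scale (x*, y*) by gamma, cap at 1, and move saturated entries
    into the integral part (x_tilde, y_tilde)."""
    n_fac, n_cli = len(y_star), len(r)
    # Scaled and capped values.
    y_bar = [min(1.0, gamma * y_star[i]) for i in range(n_fac)]
    x_bar = [[min(1.0, gamma * x_star[i][j]) for j in range(n_cli)]
             for i in range(n_fac)]
    y_tilde = [0] * n_fac
    x_tilde = [[0] * n_cli for _ in range(n_fac)]
    r_bar = list(r)
    for i in range(n_fac):
        if y_bar[i] >= 1.0 - tol:            # facility fully open after scaling
            y_bar[i], y_tilde[i] = 0.0, 1
    for i in range(n_fac):
        for j in range(n_cli):
            if x_bar[i][j] >= 1.0 - tol:     # connection fully used after scaling
                x_bar[i][j], x_tilde[i][j] = 0.0, 1
                r_bar[j] -= 1                # one unit of requirement is covered
    return x_bar, y_bar, r_bar, x_tilde, y_tilde


# Hypothetical input: two facilities, one client with requirement 1.
x_bar, y_bar, r_bar, x_tilde, y_tilde = scale_solution(
    x_star=[[0.8], [0.2]], y_star=[0.8, 0.3], r=[1])
```

In this example the first facility saturates and is opened immediately, so the residual requirement of the only client drops to zero.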

For the feasibility of the final solution, it is essential that if we connected 
client $j$ to facility $i$ in this initial phase, we will not connect it again to $i$ in the 
rest of the algorithm. There will be two ways of connecting clients in the process 
of rounding $\bar{x}$. The first one connects client $j$ to a subset of facilities serving $j$ in 
$\bar{x}$. Recall that if $j$ was connected to facility $i$ in the initial phase, then $\bar{x}_{ij} = 0$, 
and no additional $i$-$j$ connection will be created. 

The connections of the second type will be created in a process of clustering. 
The clustering that we will use is a generalization of the clustering used by 
Chudak & Shmoys for the UFL problem [3]. As a result of this clustering process, 
client $j$ will be allowed to connect itself via a different client $j'$ to a facility open 
around $j'$. Client $j'$ will be called a cluster center for a subset of facilities, and it 
will make sure that at least some guaranteed number of these facilities will get 
opened. 

To be certain that client $j$ does not get connected again to facility $i$ with a 
path via client $j'$, facility $i$ will never be a member of the set of facilities clustered 
by client $j'$. We call a facility $i$ special for client $j$ iff $\tilde{y}_i = 1$ and $0 < \bar{x}_{ij} < 1$. 
Note that, by our earlier assumption, there is at most one special facility for 
each client $j$, and that a special facility must be at maximal distance among 
facilities serving $j$ in $\bar{x}$. When rounding the fractional solution in Section 3.5, we 
take care that special facilities are not members of the formed clusters. 

3.3 Close and distant facilities 

Before we describe how we cluster facilities, we specify the facilities that are 
interesting for a particular client in the clustering process. The following can be 
thought of as a version of a filtering technique of Lin and Vitter [7], first applied 
to facility location by Shmoys et al. [8]. The analysis that we use here is a version 
of the argument of Byrka [2]. 

As a result of the scaling that was described in the previous section, the connection 
variables $\bar{x}$ amount to a total connectivity that exceeds the requirement 
$\bar{r}$. More precisely, we have $\sum_{i \in \mathcal{F}} \bar{x}_{ij} \ge \gamma \cdot \bar{r}_j$ for every client $j \in \mathcal{C}$. We will 
consider for each client $j$ a subset of facilities that are just enough to provide it 
a fractional connection of $\bar{r}_j$. Such a subset is called a set of close facilities of 
client $j$ and is defined as follows. 

For every client $j$ consider the following construction. Let $i_1, i_2, \ldots, i_{|\mathcal{F}|}$ be the 
ordering of facilities in $\mathcal{F}$ in a nondecreasing order of distances $c_{ij}$ to client $j$. Let 
$i_k$ be the facility in this ordering such that $\sum_{l=1}^{k-1} \bar{x}_{i_l j} < \bar{r}_j$ and $\sum_{l=1}^{k} \bar{x}_{i_l j} \ge \bar{r}_j$. 
Define 

$$\bar{x}^{(c)}_{i_l j} = \begin{cases} \bar{x}_{i_l j} & \text{for } l < k, \\ \bar{r}_j - \sum_{l'=1}^{k-1} \bar{x}_{i_{l'} j} & \text{for } l = k, \\ 0 & \text{for } l > k. \end{cases}$$

Define $\bar{x}^{(d)}_{ij} = \bar{x}_{ij} - \bar{x}^{(c)}_{ij}$ for all $i \in \mathcal{F}$, $j \in \mathcal{C}$. 

We will call the set of facilities $i \in \mathcal{F}$ such that $\bar{x}^{(c)}_{ij} > 0$ the set of close 
facilities of client $j$ and we denote it by $C_j$. By analogy, we will call the set of 
facilities $i \in \mathcal{F}$ such that $\bar{x}^{(d)}_{ij} > 0$ the set of distant facilities of client $j$ and 
denote it $D_j$. Observe that for a client $j$ the intersection of $C_j$ and $D_j$ is either 
empty, or contains exactly one facility. In the latter case, we will say that this 
facility is both distant and close. Note that, unlike in the UFL problem, we 
may not simply split this facility into a close and a distant part, because it is 
essential that we make at most one connection to this facility in the final integral 
solution. Let $d_j^{(max)} = c_{i_k j}$ be the distance from client $j$ to the farthest of its 
close facilities. 
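
The split of $\bar{x}$ into a close part $\bar{x}^{(c)}$ and a distant part $\bar{x}^{(d)}$ for a single client can be sketched as below. The function name, the per-client list layout, and the tolerance `tol` are assumptions made for this illustration.

```python
def split_close_distant(dist_j, xbar_j, rbar_j, tol=1e-12):
    """For one client j: dist_j[i] = c_ij and xbar_j[i] = x-bar_ij.
    Returns (x_close, x_distant, d_max)."""
    # Facilities in nondecreasing order of distance to j.
    order = sorted(range(len(dist_j)), key=lambda i: dist_j[i])
    x_close = [0.0] * len(dist_j)
    remaining = float(rbar_j)          # close mass still to be assigned
    d_max = 0.0
    for i in order:
        if remaining <= tol:
            break
        take = min(xbar_j[i], remaining)
        if take > 0:
            x_close[i] = take
            d_max = dist_j[i]          # farthest close facility seen so far
            remaining -= take
    x_distant = [xbar_j[i] - x_close[i] for i in range(len(dist_j))]
    return x_close, x_distant, d_max


# Hypothetical client with residual requirement 1 and four facilities.
dist_j = [1.0, 2.0, 3.0, 4.0]
xbar_j = [0.5, 0.75, 0.5, 0.25]
x_close, x_distant, d_max = split_close_distant(dist_j, xbar_j, 1)
close_set = {i for i, v in enumerate(x_close) if v > 0}
distant_set = {i for i, v in enumerate(x_distant) if v > 0}
```

In this example the facility with index 1 ends up both close and distant, which is the situation the text singles out.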



3.4 Clustering 

We will now construct a family of subsets of facilities $\mathcal{S} \subseteq 2^{\mathcal{F}}$. These subsets $S \in 
\mathcal{S}$ will be called clusters and they will guide the rounding procedure described 
next. There will be a client related to each cluster, and each single client $j$ will 
be related to at most one cluster, which we call $S_j$. 

Not all the clients participate in the clustering process. Clients $j$ with $\bar{r}_j = 1$ 
and a special facility $i' \in C_j$ (recall that a special facility is a facility that is fully 
open in $\tilde{y}$ but only partially used by $j$ in $\bar{x}$) will be called special and will not 
take part in the clustering process. Let $\mathcal{C}'$ denote the set of all other, non-special 
clients. Observe that, as a result of scaling, clients $j$ with $\bar{r}_j \ge 2$ do not have any 
special facilities among their close facilities (since $\sum_{i} \bar{x}_{ij} \ge \gamma \bar{r}_j > \bar{r}_j + 1$). As 
a consequence, there are no special facilities among the close facilities of clients 
from $\mathcal{C}'$, the only clients actively involved in the clustering procedure. 

For each client $j \in \mathcal{C}'$ we will keep two families $A_j$ and $B_j$ of disjoint subsets 
of facilities. Initially $A_j = \{\{i\} : i \in C_j\}$, i.e., $A_j$ is initialized to contain a 
singleton set for each close facility of client $j$; $B_j$ is initially empty. $A_j$ will be 
used to store these initial singleton sets, but also clusters containing only close 
facilities of $j$; $B_j$ will be used to store only clusters that contain at least one 
close facility of $j$. When adding a cluster to either $A_j$ or $B_j$ we will remove all 
the subsets it intersects from both $A_j$ and $B_j$; therefore subsets in $A_j \cup B_j$ will 
always be pairwise disjoint. 

The family of clusters that we will construct will be a laminar family of subsets 
of facilities, i.e., any two clusters are either disjoint or one entirely contains 
the other. One may imagine facilities being leaves and clusters being internal 
nodes of a forest that eventually becomes a tree, when all the clusters are added. 

We will use $y(S)$ as a shorthand for $\sum_{i \in S} \bar{y}_i$. Let us define $\underline{y}(S) = \lfloor y(S) \rfloor$. As 
a consequence of using the family of clusters to guide the rounding process, by 
Property (P2') of the dependent rounding procedure when applied to a cluster, 
the quantity $\underline{y}(S)$ lower bounds the number of facilities that will certainly be 
opened in cluster $S$. Additionally, let us define the residual requirement of client 
$j$ to be $rr_j = \bar{r}_j - \sum_{S \in (A_j \cup B_j)} \underline{y}(S)$, that is, $\bar{r}_j$ minus a lower bound on the 
number of facilities that will be opened in clusters from $A_j$ and $B_j$. 

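We use the following procedure to compute clusters. While there exists a 
client $j \in \mathcal{C}'$ such that $rr_j > 0$, take such $j$ with minimal $d_j^{(max)}$ and do the 
following: 

1. Take $X_j$ to be an inclusion-wise minimal subset of $A_j$ such that $\sum_{S \in X_j} (y(S) - 
\underline{y}(S)) \ge rr_j$. Form the new cluster $S_j = \bigcup_{S \in X_j} S$. 

2. Make $S_j$ a new cluster by setting $\mathcal{S} \leftarrow \mathcal{S} \cup \{S_j\}$. 

3. Update $A_j \leftarrow (A_j \setminus X_j) \cup \{S_j\}$. 

4. For each client $j'$ with $rr_{j'} > 0$ do 

- If $X_j \subseteq A_{j'}$, then set $A_{j'} \leftarrow (A_{j'} \setminus X_j) \cup \{S_j\}$. 

- If $X_j \cap A_{j'} \ne \emptyset$ and $X_j \setminus A_{j'} \ne \emptyset$, 

then set $A_{j'} \leftarrow A_{j'} \setminus X_j$ and $B_{j'} \leftarrow \{S \in B_{j'} : S \cap S_j = \emptyset\} \cup \{S_j\}$. 

Eventually, add a cluster $S_r = \mathcal{F}$ containing all the facilities to the family $\mathcal{S}$. 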
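
Step 1 of the procedure asks for an inclusion-wise minimal subfamily $X_j$ of $A_j$ whose total slack $\sum_{S}(y(S) - \lfloor y(S) \rfloor)$ reaches $rr_j$. One possible way to obtain such a subfamily is to start from all of $A_j$ and drop sets one by one while the requirement still holds, as sketched below. The paper does not prescribe which minimal subfamily to take, so the removal order, the function names, and the use of exact fractions in the example are assumptions.

```python
import math
from fractions import Fraction


def frac_slack(y_bar, cluster):
    """y(S) - floor(y(S)) for a set S of facility indices."""
    total = sum(y_bar[i] for i in cluster)
    return total - math.floor(total)


def minimal_subfamily(y_bar, family, rr):
    """Return an inclusion-wise minimal list X of sets from `family` with
    sum of slacks >= rr, or None if even the whole family falls short."""
    chosen = list(family)
    if sum(frac_slack(y_bar, S) for S in chosen) < rr:
        return None
    for S in list(chosen):
        rest = [T for T in chosen if T is not S]
        # Slacks are non-negative, so a set that cannot be dropped now
        # cannot be dropped later either; one pass gives minimality.
        if sum(frac_slack(y_bar, T) for T in rest) >= rr:
            chosen = rest
    return chosen


# Hypothetical data: four singleton sets, residual requirement 1.
y_bar = [Fraction(1, 2), Fraction(1, 2), Fraction(3, 4), Fraction(1, 4)]
family = [frozenset({0}), frozenset({1}), frozenset({2}), frozenset({3})]
X_j = minimal_subfamily(y_bar, family, 1)
S_j = frozenset().union(*X_j)          # the new cluster formed from X_j
```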

We call a client $j'$ active in a particular iteration if, before this iteration, 
its residual requirement $rr_{j'} = \bar{r}_{j'} - \sum_{S \in (A_{j'} \cup B_{j'})} \underline{y}(S)$ was positive. During the 
above procedure, all active clients $j$ have in their sets $A_j$ and $B_j$ only maximal 
subsets of facilities, which means they are not subsets of any other clusters (i.e., 
they are roots of their trees in the current forest). Therefore, when a new cluster 
$S_j$ is created, it contains all the other clusters with which it has nonempty 
intersections (i.e., the new cluster $S_j$ becomes a root of a new tree). 

We shall now argue that there is enough fractional opening in clusters in $A_j$ 
to cover the residual requirement $rr_j$ when cluster $S_j$ is to be formed. Consider 
a fixed client $j \in \mathcal{C}'$. Recall that at the start of the clustering we have $A_j = 
\{\{i\} : i \in C_j\}$, and therefore $\sum_{S \in A_j} (y(S) - \underline{y}(S)) = \sum_{i \in C_j} \bar{y}_i \ge \bar{r}_j = rr_j$. It 
remains to show that $\sum_{S \in A_j} (y(S) - \underline{y}(S)) - rr_j$ does not decrease over time 
until client $j$ is considered. When a client $j'$ with $d_{j'}^{(max)} \le d_j^{(max)}$ is considered 
and cluster $S_{j'}$ is created, the following cases are possible: 

1. $S_{j'} \cap (\bigcup_{S \in A_j} S) = \emptyset$: then $A_j$ and $rr_j$ do not change; 

2. $S_{j'} \subseteq (\bigcup_{S \in A_j} S)$: then $A_j$ changes its structure, but $\sum_{S \in A_j} y(S)$ and $\sum_{S \in B_j} \underline{y}(S)$ 
do not change; hence $\sum_{S \in A_j} (y(S) - \underline{y}(S)) - rr_j$ also does not change; 

3. $S_{j'} \cap (\bigcup_{S \in A_j} S) \ne \emptyset$ and $S_{j'} \setminus (\bigcup_{S \in A_j} S) \ne \emptyset$: then, by inclusion-wise minimality 
of set $X_{j'}$, we have $\underline{y}(S_{j'}) - \sum_{S \in B_j, S \subseteq S_{j'}} \underline{y}(S) - \sum_{S \in A_j, S \subseteq S_{j'}} y(S) \ge 0$; 
hence, $\sum_{S \in A_j} (y(S) - \underline{y}(S)) - rr_j$ cannot decrease. 

Let $A'_j = A_j \cap \mathcal{S}$ be the set of clusters in $A_j$. Recall that all facilities in 
clusters in $A'_j$ are close facilities of $j$. Note also that each cluster $S_{j'} \in B_j$ was 
created from close facilities of a client $j'$ with $d_{j'}^{(max)} \le d_j^{(max)}$. We also have for 
each $S_{j'} \in B_j$ that $S_{j'} \cap C_j \ne \emptyset$; hence, by the triangle inequality, all facilities 
in $S_{j'}$ are at distance at most $3 \cdot d_j^{(max)}$ from $j$. We thus infer the following 

Corollary 2. The family of clusters $\mathcal{S}$ contains for each client $j \in \mathcal{C}'$ a collection 
of disjoint clusters $A'_j \cup B_j$ containing only facilities within distance $3 \cdot d_j^{(max)}$, 
and $\sum_{S \in A'_j \cup B_j} \lfloor \sum_{i \in S} \bar{y}_i \rfloor \ge \bar{r}_j$. 

Note that our clustering is related to, but more complex than, the one of 
Chudak and Shmoys [3] for UFL and of Swamy and Shmoys [10] for FTFL, 
where clusters are pairwise disjoint and each contains facilities whose fractional 
opening sums up to or slightly exceeds the value of 1. 

3.5 Opening of facilities by dependent rounding 

Given the family of subsets $\mathcal{S} \subseteq 2^{\mathcal{F}}$ computed by the clustering procedure from 
Section 3.4, we may proceed with rounding the fractional opening vector $\bar{y}$ into 
an integral vector $y^R$. We do it by applying the rounding technique of Section 2, 
guided by the family $\mathcal{S}$, which is done as follows. 

While there is more than one fractional entry, select a minimal subset 
$S \in \mathcal{S}$ which contains more than one fractional entry and apply the rounding 
procedure to entries of $\bar{y}$ indexed by elements of $S$ until at most one entry in 
$S$ remains fractional. Eventually, if there remains a fractional entry, round it 
independently and let $y^R$ be the resulting vector. 

Observe that the above process is one of the possible implementations of 
dependent rounding applied to $\bar{y}$. As a result, the random integral vector $y^R$ 
satisfies properties (P1), (P2), and (P3). Additionally, property (P2') holds for 
each cluster $S \in \mathcal{S}$. Hence, at least $\lfloor \sum_{i \in S} \bar{y}_i \rfloor$ entries in each $S \in \mathcal{S}$ are rounded 
to 1. Therefore, by Corollary 2, we get 

Corollary 3. For each client $j \in \mathcal{C}'$, 

$$|\{i \in \mathcal{F} \mid y^R_i = 1 \text{ and } c_{ij} \le 3 \cdot d_j^{(max)}\}| \ge \bar{r}_j.$$

Next, we combine the facilities opened by rounding $y^R$ with facilities opened 
already when scaling, which are recorded in $\tilde{y}$, i.e., we update $\tilde{y} \leftarrow \tilde{y} + y^R$. 

Eventually, we connect each client $j \in \mathcal{C}$ to $r_j$ closest opened facilities and 
code it in $\tilde{x}$. 
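
The final connection step is straightforward; a minimal sketch, assuming distances are stored as `c[i][j]` and the set of open facilities is given by index, could look as follows. The function name and the error handling are illustrative choices.

```python
def connect_to_closest(c, open_facilities, r):
    """Return assignment[j]: the r[j] open facilities closest to client j.
    c[i][j] is the distance between facility i and client j."""
    assignment = []
    for j, need in enumerate(r):
        ranked = sorted(open_facilities, key=lambda i: c[i][j])
        if len(ranked) < need:
            raise ValueError(f"client {j} needs {need} open facilities")
        assignment.append(ranked[:need])
    return assignment


# Hypothetical instance: three facilities, two clients, facilities 0 and 2 open.
c = [[1, 5],
     [2, 1],
     [3, 2]]
assignment = connect_to_closest(c, open_facilities={0, 2}, r=[1, 2])
```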

4 Analysis 

We will now estimate the expected cost of the solution $(\tilde{x}, \tilde{y})$. The tricky part is 
to bound the connection cost, which we do as follows. We argue that a certain 
fraction of the demand of client j may be satisfied from its close facilities, then 
some part of the remaining demand can be satisfied from its distant facilities. 
Eventually, the remaining (not too large in expectation) part of the demand is 
satisfied via clusters. 



4.1 Average distances 



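Let us consider weighted average distances from a client $j$ to sets of facilities 
fractionally serving it. Let $d_j$ be the average connection cost in $\bar{x}_{ij}$ defined as 

$$d_j = \frac{\sum_{i \in \mathcal{F}} c_{ij} \cdot \bar{x}_{ij}}{\sum_{i \in \mathcal{F}} \bar{x}_{ij}}.$$

Let $d_j^{(c)}$, $d_j^{(d)}$ be the average connection costs in $\bar{x}^{(c)}_{ij}$ and $\bar{x}^{(d)}_{ij}$ defined as 

$$d_j^{(c)} = \frac{\sum_{i \in \mathcal{F}} c_{ij} \cdot \bar{x}^{(c)}_{ij}}{\sum_{i \in \mathcal{F}} \bar{x}^{(c)}_{ij}}, \qquad d_j^{(d)} = \frac{\sum_{i \in \mathcal{F}} c_{ij} \cdot \bar{x}^{(d)}_{ij}}{\sum_{i \in \mathcal{F}} \bar{x}^{(d)}_{ij}}.$$

Let $R_j$ be a parameter defined as 

$$R_j = \frac{d_j - d_j^{(c)}}{d_j}$$

if $d_j > 0$ and $R_j = 0$ otherwise. Observe that $R_j$ takes value between 0 and 1. 
$R_j = 0$ implies $d_j^{(c)} = d_j = d_j^{(d)}$, and $R_j = 1$ occurs only when $d_j^{(c)} = 0$. The role 
played by $R_j$ is that it measures a certain parameter of the instance: big values 
are good for one part of the analysis, small values are good for the other. 

Lemma 1. $d_j^{(d)} \le d_j \left(1 + \frac{R_j}{\gamma - 1}\right)$. 

Proof. Recall that $\sum_{i \in \mathcal{F}} \bar{x}^{(c)}_{ij} = \bar{r}_j$ and $\sum_{i \in \mathcal{F}} \bar{x}^{(d)}_{ij} \ge (\gamma - 1) \cdot \bar{r}_j$. Therefore, we 
have $(d_j^{(d)} - d_j) \cdot (\gamma - 1) \le (d_j - d_j^{(c)}) \cdot 1 = R_j \cdot d_j$, which can be rearranged to 
get $d_j^{(d)} \le d_j \left(1 + \frac{R_j}{\gamma - 1}\right)$. 

Finally, observe that the average distance from $j$ to the distant facilities of 
$j$ gives an upper bound on the maximal distance to any of the close facilities of 
$j$. Namely, $d_j^{(max)} \le d_j^{(d)}$. 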
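
The quantities $d_j$, $d_j^{(c)}$, $d_j^{(d)}$ and $R_j$, together with the bound of Lemma 1, can be computed directly from a close/distant split of one client's connection variables. The sketch below does this for a small made-up example; the function names and data are assumptions, and the example is chosen so that the close mass equals $\bar{r}_j = 1$ and the distant mass is at least $(\gamma - 1) \bar{r}_j$, as the lemma requires.

```python
def weighted_average(dist_j, weights):
    """Weighted average distance; 0.0 if all weights vanish."""
    total = sum(weights)
    if total <= 0:
        return 0.0
    return sum(d * w for d, w in zip(dist_j, weights)) / total


def lemma1_quantities(dist_j, x_close, x_distant, gamma):
    """Return (d, d_c, d_d, R, bound) with bound = d * (1 + R / (gamma - 1))."""
    x_bar = [a + b for a, b in zip(x_close, x_distant)]
    d = weighted_average(dist_j, x_bar)
    d_c = weighted_average(dist_j, x_close)
    d_d = weighted_average(dist_j, x_distant)
    R = (d - d_c) / d if d > 0 else 0.0
    return d, d_c, d_d, R, d * (1 + R / (gamma - 1))


# Hypothetical client: close mass 1.0, distant mass 1.0 >= gamma - 1.
dist_j = [1.0, 2.0, 3.0, 4.0]
x_close = [0.5, 0.5, 0.0, 0.0]
x_distant = [0.0, 0.25, 0.5, 0.25]
d, d_c, d_d, R, bound = lemma1_quantities(dist_j, x_close, x_distant, 1.7245)
```

Here the farthest close facility is at distance 2, which is indeed below the average distant distance, in line with the closing remark above.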



4.2 Amount of service from close and distant facilities 

We now argue that in the solution $(\tilde{x}, \tilde{y})$, a certain portion of the demand is 
expected to be served by the close and distant facilities of each client. Recall 
that for a client $j$ it is possible that there is a facility that is both its close 
and its distant facility. Once we have a solution that opens such a facility, we 
would like to say what fraction of the demand is served from the close facilities. 
To make our analysis simpler we will toss a properly biased coin to decide if 
using this facility counts as using a close facility. 

Fig. 1. Distances to facilities serving client $j$ in $\bar{x}$. The width of a rectangle corresponding 
to facility $i$ is equal to $\bar{x}_{ij}$. The figure helps to understand the meaning of $R_j$. 
(Vertical axis: distance, with marked levels for the average distance to distant facilities, 
the maximal distance to close facilities $d_j^{(max)}$, and the average distance to close 
facilities $d_j^{(c)} = d_j (1 - R_j)$. Horizontal axis: close facilities, followed by distant facilities.) 

With this trick we, in a sense, 
split such a facility into a close and a distant part. Note that we may only do 
it for this part of the analysis, but not for the actual rounding algorithm from 
Section 3.5. Applying the above-described split of the undecided facility, we get 
that the total fractional opening of close facilities of client $j$ is exactly $\bar{r}_j$, and 
the total fractional opening of both close and distant facilities is at least $\gamma \cdot \bar{r}_j$. 
Therefore, Corollary 1 yields the following: 

Corollary 4. The amount of close facilities used by client $j$ in a solution described 
in Section 3.5 is expected to be at least $(1 - \frac{1}{e}) \cdot \bar{r}_j$. 

Corollary 5. The amount of close and distant facilities used by client $j$ in a 
solution described in Section 3.5 is expected to be at least $(1 - \frac{1}{e^\gamma}) \cdot \bar{r}_j$. 

Motivated by the above bounds we design a selection method to choose a 
(large-enough in expectation) subset of facilities opened around client j: 

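Lemma 2. For $j \in \mathcal{C}'$ we can select a subset $F_j$ of open facilities from $C_j \cup D_j$ 
such that: 

$$|F_j| \le \bar{r}_j \quad \text{(with probability 1)},$$

$$\mathbf{E}[|F_j|] = \left(1 - \frac{1}{e^\gamma}\right) \cdot \bar{r}_j,$$

$$\mathbf{E}\Big[\sum_{i \in F_j} c_{ij}\Big] \le \left(\left(1 - \frac{1}{e}\right) \cdot \bar{r}_j\right) \cdot d_j^{(c)} + \left(\left(\left(1 - \frac{1}{e^\gamma}\right) - \left(1 - \frac{1}{e}\right)\right) \cdot \bar{r}_j\right) \cdot d_j^{(d)}.$$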



A rather technical but not difficult proof of the above lemma is given in 
Appendix C. 

4.3 Calculation 

We may now combine the pieces into the algorithm ALG: 

1. solve the LP-relaxation of (1)-(5); 

2. scale the fractional solution as described in Section 3.2; 

3. create a family of clusters as described in Section 3.4; 

4. round the fractional openings as described in Section 3.5; 

5. connect each client $j$ to $r_j$ closest open facilities; 

6. output the solution as $(\tilde{x}, \tilde{y})$. 

Theorem 4. ALG is an approximation algorithm for FTFL. 

Proof. First observe that the solution produced by ALG is trivially feasible for 
the original problem (1)-(5), as we simply choose $r_j$ different facilities for client 
$j$ in step 5. What is less trivial is that all the $r_j$ facilities used by $j$ are within 
a certain small distance. Let us now bound the expected connection cost of the 
obtained solution. 

For each client $j \in \mathcal{C}$ we get $r_j - \bar{r}_j$ facilities opened in Step 2. As we already 
argued in Section 3.2, we may afford to connect $j$ to these facilities and pay 
the connection cost from the difference between $\sum_i c_{ij} \hat{x}_{ij}$ and $\sum_i c_{ij} \bar{x}_{ij}$. We will 
now argue that client $j$ may connect to the remaining $\bar{r}_j$ facilities with the expected 
connection cost bounded by $\sum_i c_{ij} \bar{x}_{ij}$. 

For a special client $j \in (\mathcal{C} \setminus \mathcal{C}')$ we have $\bar{r}_j = 1$ and already in Step 2 one 
special facility at distance $d_j^{(max)}$ from $j$ is opened. We cannot blindly connect 
$j$ to this facility, since $d_j^{(max)}$ may potentially be bigger than $\gamma \cdot d_j$. What we 
do instead is that we first look at close facilities of $j$ that, as a result of the 
rounding in Step 4, with a certain probability, give one open facility at a small 
distance. By Corollary 4 this probability is at least $1 - 1/e$. It is easy to observe 
that the expected connection cost to this open facility is at most $d_j^{(c)}$. Only if 
no close facility is open, we use the special facility, which results in the expected 
connection cost of client $j$ being at most 

$$(1 - 1/e) d_j^{(c)} + (1/e) d_j^{(d)} \le (1 - 1/e) d_j^{(c)} + (1/e) d_j \left(1 + \frac{R_j}{\gamma - 1}\right) \le d_j \left(1 + \frac{1}{e \cdot (\gamma - 1)}\right) \le \gamma d_j,$$

where the first inequality is a consequence of Lemma 1, and the last one is a 
consequence of the choice of $\gamma \approx 1.7245$. 

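In the remainder, we only look at non-special clients $j \in \mathcal{C}'$. By Lemma 2, 
client $j$ may select to connect itself to the subset of open facilities $F_j$, and pay 
for this connection at most $((1 - 1/e) \cdot \bar{r}_j) \cdot d_j^{(c)} + (((1 - \frac{1}{e^\gamma}) - (1 - 1/e)) \cdot \bar{r}_j) \cdot d_j^{(d)}$ in 
expectation. The expected number of facilities needed on top of those from $F_j$ is 
$\bar{r}_j - \mathbf{E}[|F_j|] = (\frac{1}{e^\gamma} \cdot \bar{r}_j)$. These remaining facilities client $j$ gets deterministically 
within the distance of at most $3 \cdot d_j^{(max)}$, which is possible by the properties of 
the rounding procedure described in Section 3.5, see Corollary 3. Therefore, the 
expected connection cost to facilities not in $F_j$ is at most $(\frac{1}{e^\gamma} \cdot \bar{r}_j) \cdot (3 \cdot d_j^{(max)})$. 
Concluding, the total expected connection cost of $j$ may be bounded by 

$$((1 - 1/e) \cdot \bar{r}_j) \cdot d_j^{(c)} + (((1 - \tfrac{1}{e^\gamma}) - (1 - 1/e)) \cdot \bar{r}_j) \cdot d_j^{(d)} + (\tfrac{1}{e^\gamma} \cdot \bar{r}_j) \cdot (3 \cdot d_j^{(max)})$$

$$\le \bar{r}_j \cdot \left((1 - 1/e) \cdot d_j^{(c)} + \left(\frac{1}{e} + \frac{2}{e^\gamma}\right) \cdot d_j^{(d)}\right)$$

$$\le \bar{r}_j \cdot d_j \cdot \left(1 + \frac{2}{e^\gamma} + R_j \cdot \left(\frac{1/e + 2/e^\gamma}{\gamma - 1} - (1 - 1/e)\right)\right),$$

where the first inequality uses $d_j^{(max)} \le d_j^{(d)}$, and the second inequality follows from Lemma 1 and the definition of $R_j$. 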

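Observe that for $1 < \gamma < 2$, we have $\frac{1/e + 2/e^\gamma}{\gamma - 1} - (1 - 1/e) > 0$. Recall that by 
definition, $R_j \le 1$; so, $R_j = 1$ is the worst case for our estimate, and therefore 

$$\bar{r}_j \cdot d_j \cdot \left(1 + \frac{2}{e^\gamma} + R_j \cdot \left(\frac{1/e + 2/e^\gamma}{\gamma - 1} - (1 - 1/e)\right)\right) \le \bar{r}_j \cdot d_j \cdot \left(\frac{1}{e} + \frac{2}{e^\gamma}\right)\left(1 + \frac{1}{\gamma - 1}\right).$$

Recall that $\bar{x}$ incurs, for each client $j$, a fractional connection cost $\sum_{i \in \mathcal{F}} c_{ij} \bar{x}_{ij} \ge 
\gamma \cdot \bar{r}_j \cdot d_j$. We fix $\gamma = \gamma_0$, such that $\gamma_0 = (\frac{1}{e} + \frac{2}{e^{\gamma_0}})(1 + \frac{1}{\gamma_0 - 1}) \le 1.7245$. 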
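
The constant $\gamma_0$ is defined only implicitly, as a fixed point of $\gamma \mapsto (1/e + 2/e^{\gamma})(1 + 1/(\gamma - 1))$. A simple bisection, sketched below, recovers its numerical value; the bracketing interval $[1.5, 2]$ and the function names are choices made for this illustration. The right-hand side is decreasing in $\gamma$ on this interval, so the fixed point there is unique.

```python
import math


def rhs(gamma):
    """(1/e + 2/e^gamma) * (1 + 1/(gamma - 1)), the bound on the cost ratio."""
    return (1 / math.e + 2 / math.exp(gamma)) * (1 + 1 / (gamma - 1))


def solve_gamma0(lo=1.5, hi=2.0, iters=60):
    """Bisection for rhs(gamma) = gamma; rhs(lo) > lo and rhs(hi) < hi."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if rhs(mid) > mid:
            lo = mid      # fixed point lies to the right of mid
        else:
            hi = mid      # fixed point lies at or to the left of mid
    return (lo + hi) / 2


gamma0 = solve_gamma0()
```

The computed value lies slightly below 1.7245, consistent with the approximation ratio stated in the text.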

To conclude, the expected connection cost of $j$ to facilities opened during 
the rounding procedure is at most the fractional connection cost of $\bar{x}$. The total 
connection cost is, therefore, at most the connection cost of $\hat{x}$, which is at most 
$\gamma$ times the connection cost of $x^*$. 

By property (P1) of dependent rounding, every single facility $i$ is opened with 
the probability $\hat{y}_i$, which is at most $\gamma$ times $y^*_i$. Therefore, the total expected 
cost of the solution produced by ALG is at most $\gamma \approx 1.7245$ times the cost of 
the fractional optimal solution $(x^*, y^*)$. 

Concluding remarks. We have presented improved approximation algorithms 
for the metric Fault-Tolerant Uncapacitated Facility Location problem. The 
main technical innovation is the usage and analysis of dependent rounding in 
this context. We believe that variants of dependent rounding will also be fruitful 
in other location problems. Finally, we conjecture that the approximation threshold 
for both UFL and FTFL is the value $1.46\ldots$ suggested by [4]; it would be 
very interesting to prove or refute this. 
Appendix 

A The rounding approach of [9] 

The dependent-rounding approach of [9], to round a given $y = (y_1, y_2, \ldots, y_N) \in 
[0, 1]^N$, is as follows. Suppose the current version of the rounded vector is $v = 
(v_1, v_2, \ldots, v_N) \in [0, 1]^N$; $v$ is initially $y$. When we describe the random choice 
made in a step below, this choice is made independent of all such choices made 
thus far. If all the $v_i$ lie in $\{0, 1\}$, we are done, so let us assume that there is at 
least one $v_i \in (0, 1)$. The first (simple) case is that there is exactly one $v_i$ that 
lies in $(0, 1)$; we round $v_i$ in the natural way: to 1 with probability $v_i$, and to 0 
with complementary probability of $1 - v_i$. Letting $V_i$ denote the rounded version 
of $v_i$, we note that 

$$\mathbf{E}[V_i] = v_i. \qquad (6)$$

This simple step is called a Type I iteration, and it completes the rounding 
process. The remaining case is that of a Type II iteration: there are at least two 
components of $v$ that lie in $(0, 1)$. In this case we choose two such components 
$v_i$ and $v_j$ with $i \ne j$, arbitrarily. Let $\epsilon$ and $\delta$ be the positive constants such that: 
(i) $v_i + \epsilon$ and $v_j - \epsilon$ lie in $[0, 1]$, with at least one of these two quantities lying 
in $\{0, 1\}$, and (ii) $v_i - \delta$ and $v_j + \delta$ lie in $[0, 1]$, with at least one of these two 
quantities lying in $\{0, 1\}$. It is easily seen that such strictly-positive $\epsilon$ and $\delta$ exist 
and can be easily computed. We then update $(v_i, v_j)$ to a random pair $(V_i, V_j)$ 
as follows: 

- with probability $\delta/(\epsilon + \delta)$, set $(V_i, V_j) := (v_i + \epsilon, v_j - \epsilon)$; 

- with the complementary probability of $\epsilon/(\epsilon + \delta)$, set $(V_i, V_j) := (v_i - \delta, v_j + \delta)$. 

The main properties of the above that we will need are: 

$$\Pr[V_i + V_j = v_i + v_j] = 1; \qquad (7)$$
$$\mathbf{E}[V_i] = v_i \text{ and } \mathbf{E}[V_j] = v_j; \qquad (8)$$
$$\mathbf{E}[V_i V_j] \le v_i v_j. \qquad (9)$$

We repeat the above iteration until we get a rounded vector. Since each 
iteration rounds at least one additional variable, we need at most $N$ iterations. 
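
The Type I and Type II iterations can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function name, the choice of always pairing the first two fractional entries, and the tolerance `tol` used to decide whether a floating-point entry is still fractional are all assumptions.

```python
import random


def dependent_round(y, rng=random, tol=1e-12):
    """Round y in [0,1]^N to a 0/1 vector by Type II steps, then one Type I step."""
    v = [float(vi) for vi in y]

    def fractional():
        return [i for i, vi in enumerate(v) if tol < vi < 1.0 - tol]

    frac = fractional()
    while len(frac) >= 2:                    # Type II iteration
        i, j = frac[0], frac[1]              # any two fractional entries would do
        eps = min(1.0 - v[i], v[j])          # largest shift keeping v_i+eps, v_j-eps in [0,1]
        delta = min(v[i], 1.0 - v[j])        # largest shift keeping v_i-delta, v_j+delta in [0,1]
        if rng.random() < delta / (eps + delta):
            v[i], v[j] = v[i] + eps, v[j] - eps
        else:
            v[i], v[j] = v[i] - delta, v[j] + delta
        frac = fractional()
    if frac:                                 # Type I iteration on the last fractional entry
        i = frac[0]
        v[i] = 1.0 if rng.random() < v[i] else 0.0
    return [int(round(vi)) for vi in v]


# Hypothetical input whose entries sum to exactly 2.
sample = dependent_round([0.5, 0.5, 0.25, 0.75], random.Random(0))
```

Because every Type II step keeps $v_i + v_j$ unchanged, the number of ones in the output equals the (integral) sum of the input in this example, regardless of the random choices.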

Note that the above description does not specify the order in which the 
elements are rounded. Observe that we may use a predefined laminar family S 
of subsets to guide the rounding procedure. That is, we may first apply Type 
II iterations to elements of the smallest subsets, then continue applying Type 
II iterations for smallest subsets among those still containing more than one 
fractional entry, and eventually round the at most one remaining fractional entry 
with a Type I iteration. One may easily verify that executing the dependent 
rounding procedure in this manner we almost preserve the sum of entries within 
each of the subsets from our laminar family. 
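
A laminar-guided variant of the rounding can be sketched by processing the sets of the family from small to large and applying Type II iterations only inside the current set. The code below is an illustrative sketch under assumptions: sets are given as Python sets of 0-based indices, the full index set is appended as the last set to process, and a tolerance `tol` decides which entries count as fractional.

```python
import math
import random


def guided_dependent_round(y, laminar_family, rng=random, tol=1e-12):
    """Dependent rounding that finishes smaller sets of a laminar family first."""
    v = [float(vi) for vi in y]

    def fractional(indices):
        return [i for i in indices if tol < v[i] < 1.0 - tol]

    def type_two(i, j):
        eps = min(1.0 - v[i], v[j])
        delta = min(v[i], 1.0 - v[j])
        if rng.random() < delta / (eps + delta):
            v[i], v[j] = v[i] + eps, v[j] - eps
        else:
            v[i], v[j] = v[i] - delta, v[j] + delta

    # In a laminar family a proper subset is strictly smaller, so sorting by
    # size handles every set after all of its subsets.
    schedule = sorted((sorted(S) for S in laminar_family), key=len)
    schedule.append(list(range(len(v))))      # finally, the set of all indices
    for S in schedule:
        frac = fractional(S)
        while len(frac) >= 2:
            type_two(frac[0], frac[1])
            frac = fractional(S)
    for i in fractional(range(len(v))):       # at most one entry is left here
        v[i] = 1.0 if rng.random() < v[i] else 0.0   # Type I iteration
    return [int(round(vi)) for vi in v]


# Hypothetical laminar family on six indices.
y = [0.5, 0.75, 0.25, 0.5, 0.5, 0.5]
family = [{0, 1}, {2, 3}, {0, 1, 2, 3}]
rounded = guided_dependent_round(y, family, random.Random(1))
```

Once a set has at most one fractional entry left, its entries equal to 1 are never touched again, which is why each set ends up with at least the floor of its fractional sum in ones.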

B Proofs of the statements in Section 2 

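Proof. (For Theorem 1) Recall that in the dependent-rounding approach, we
begin with the vector $v^{(0)} = (y_1, y_2, \ldots, y_N)$; in each iteration $t \geq 1$, we start
with a vector $v^{(t-1)}$ and probabilistically modify at most two of its entries, to
produce the vector $v^{(t)}$. We define a potential function $\Phi(v^{(t)})$, which is a random
variable that is fully determined by $v^{(t)}$, i.e., determined by the random choices
made in iterations $1, 2, \ldots, t$:

\[ \Phi(v^{(t)}) = \sum_{\ell=0}^{s} \lambda_\ell \cdot \sum_{A \subseteq S:\ |A| = \ell} \Big( \prod_{a \in A} v_a^{(t)} \cdot \prod_{b \in (S - A)} (1 - v_b^{(t)}) \Big), \tag{10} \]

where $s = |S|$.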

Recall that dependent rounding terminates in some $m \leq N$ iterations. A
moment's reflection shows that:

\[ \Phi(v^{(0)}) = \mathbf{E}[g_{\lambda,S}(\mathcal{R}(y))]; \qquad \mathbf{E}[\Phi(v^{(m)})] = \mathbf{E}[g_{\lambda,S}(\hat{y})]. \tag{11} \]

Our main inequality will be the following:

\[ \forall t \in [m], \quad \mathbf{E}[\Phi(v^{(t)})] \geq \mathbf{E}[\Phi(v^{(t-1)})]. \tag{12} \]

This implies that

\[ \mathbf{E}[\Phi(v^{(m)})] \geq \mathbf{E}[\Phi(v^{(0)})] = \Phi(v^{(0)}), \]

which, in conjunction with (11), will complete our proof.



Fix any $t \in [m]$, and fix any choice for the vector $v^{(t-1)}$ that happens with
positive probability. Conditional on this choice, we will next prove that

\[ \mathbf{E}[\Phi(v^{(t)})] \geq \Phi(v^{(t-1)}); \tag{13} \]

note that the expectation in the l.h.s. is only w.r.t. the random choice made in
iteration $t$, since $v^{(t-1)}$ is now fixed. Once we have (13), (12) follows from Bayes'
Theorem by a routine conditioning on the value of $v^{(t-1)}$.

Let us show (13). We first dispose of two simple cases. Suppose iteration $t$ is
a Type I iteration, and that $v_i^{(t-1)}$ is the only component of $v^{(t-1)}$ that lies in
$(0, 1)$. Since $\Phi(v^{(t)})$ is a linear function of the random variable $v_i^{(t)}$, (13) holds
with equality, by (6). A similar argument holds if iteration $t$ is a Type II iteration
in which the components $v_i^{(t-1)}$ and $v_j^{(t-1)}$ are probabilistically altered in this
iteration, if at most one of $i$ and $j$ lies in $S$.

So suppose iteration $t$ is a Type II iteration, and that both $i$ and $j$ lie in $S$
(again, $v_i^{(t-1)}$ and $v_j^{(t-1)}$ are the components altered in this iteration). Let $v_i =
v_i^{(t-1)}$ and $v_j = v_j^{(t-1)}$ for notational simplicity, and let $V_i$ and $V_j$ denote their
respective altered values. Note that there are deterministic reals $u_0, u_1, u_2, u_3$
which depend only on the components of $v^{(t-1)}$ other than $v_i^{(t-1)}$ and $v_j^{(t-1)}$,
such that

\[ \Phi(v^{(t-1)}) = u_0 + u_1 v_i + u_2 v_j + u_3 v_i v_j; \]

\[ \Phi(v^{(t)}) = u_0 + u_1 V_i + u_2 V_j + u_3 V_i V_j. \]

Therefore, in order to prove our desired bound (13), we have from (8) and (9)
that it is sufficient to show

\[ u_3 \leq 0, \tag{14} \]

which we proceed to do next.

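Let us analyze (10), the definition of $\Phi$, to calculate $u_3$. Let, for $0 \leq \ell \leq s$,
$\alpha_\ell$ denote the contribution of the term

\[ \lambda_\ell \cdot \sum_{A \subseteq S:\ |A| = \ell} \Big( \prod_{a \in A} v_a \cdot \prod_{b \in (S - A)} (1 - v_b) \Big) \tag{15} \]

to $u_3$; note that

\[ u_3 = \sum_{\ell=0}^{s} \alpha_\ell. \]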



In order to compute the values $\alpha_\ell$, it is convenient to define certain quantities
$\beta_r$, which we do next. Define $T = S - \{i, j\}$, and note that $|T| = s - 2$. For
$0 \leq r \leq s - 2$, define

\[ \beta_r = \sum_{B \subseteq T:\ |B| = r} \Big( \prod_{p \in B} v_p \cdot \prod_{q \in (T - B)} (1 - v_q) \Big). \]



Now, as a warmup, note that $\alpha_0 = \lambda_0 \beta_0$ and $\alpha_s = \lambda_s \beta_{s-2}$. Let us next compute $\alpha_\ell$
for $1 \leq \ell \leq s - 1$. The sum (15) can contribute a "$v_i \cdot v_j$" term in three ways:

- by taking both $i$ and $j$ in the set $A$ in (15) (this is possible only if $\ell \geq 2$),
with a coefficient of $\lambda_\ell \beta_{\ell-2}$ for the $v_i \cdot v_j$ term;

- by taking both $i$ and $j$ in the set $S - A$ in (15) (this is possible only if
$\ell \leq s - 2$), with a coefficient of $\lambda_\ell \beta_\ell$ for the $v_i \cdot v_j$ term; and

- by taking exactly one of $i$ and $j$ in the set $A$ (this is possible for any $\ell \in [s - 1]$),
with a coefficient of $-2 \lambda_\ell \beta_{\ell-1}$ for the $v_i \cdot v_j$ term (with the factor of
2 arising from the choice of $i$ or $j$ to put in $A$).

Rearranging the above three items, the coefficient of $\beta_r$ in $u_3$, for $0 \leq r \leq
s - 2$, is $\lambda_r - 2\lambda_{r+1} + \lambda_{r+2}$. That is,

\[ u_3 = \sum_{r=0}^{s-2} (\lambda_r - 2\lambda_{r+1} + \lambda_{r+2}) \cdot \beta_r. \]

Thus, the hypothesis of the theorem and the fact that all the values $\beta_r$ are
non-negative, together show that $u_3 \leq 0$, as required by (14).
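The closed form for $u_3$ can be checked numerically on a small instance. The sketch below is only an illustration with hypothetical helper names (`weight_sum`, `phi`, `u3_by_corners`, `u3_by_formula`): it takes $S$ to be the whole index set with $i$ and $j$ as the first two coordinates, extracts $u_3$ from the potential by evaluating it at the four corners of $(v_i, v_j)$, and compares the result with the sum over $\beta_r$.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def weight_sum(values, size):
    """Sum over subsets A with |A| = size of
    prod_{a in A} v_a * prod_{b not in A} (1 - v_b)."""
    idx = range(len(values))
    total = Fraction(0)
    for subset in combinations(idx, size):
        chosen = set(subset)
        total += prod(values[a] if a in chosen else 1 - values[a] for a in idx)
    return total


def phi(lam, values):
    """Potential function: sum over l of lam[l] * weight_sum(values, l)."""
    return sum(lam[l] * weight_sum(values, l) for l in range(len(values) + 1))


def u3_by_corners(lam, rest):
    """Coefficient of v_i * v_j, using that phi is multilinear in (v_i, v_j)."""
    def f(a, b):
        return phi(lam, [Fraction(a), Fraction(b)] + list(rest))
    return f(1, 1) - f(1, 0) - f(0, 1) + f(0, 0)


def u3_by_formula(lam, rest):
    """Sum over r of (lam[r] - 2 lam[r+1] + lam[r+2]) * beta_r."""
    s = len(rest) + 2
    return sum(
        (lam[r] - 2 * lam[r + 1] + lam[r + 2]) * weight_sum(list(rest), r)
        for r in range(s - 1)
    )


# Illustrative instance (an assumption): s = 4 and lam = (0, 1, 2, 2, 2).
lam = [0, 1, 2, 2, 2]
rest = [Fraction(1, 3), Fraction(3, 4)]
corner_value = u3_by_corners(lam, rest)
formula_value = u3_by_formula(lam, rest)
```

For this instance only the term with $r = 1$ has a non-zero second difference, so both computations reduce to $-\beta_1$.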

Proof. (For Theorem 2) Let $s = |S|$. The theorem directly follows from
property (P1) if either $s \leq 1$ or $k \geq s$, so we may assume that $s \geq 2$ and that
$k \leq s - 1$. Of course, we may also assume that $k \geq 1$. Note that for any
$x \in \{0, 1\}^N$,

\[ \min\{k, \mathrm{Sum}_S(x)\} = \Big( \sum_{\ell \leq k} \ell \cdot I(\mathrm{Sum}_S(x) = \ell) \Big) + \Big( \sum_{\ell > k} k \cdot I(\mathrm{Sum}_S(x) = \ell) \Big) = g_{\lambda,S}(x), \]

where

\[ \lambda = (0, 1, 2, \ldots, k, k, k, \ldots, k). \]

It is easy to verify that for all $0 \leq r \leq s - 2$, $\lambda_r - 2\lambda_{r+1} + \lambda_{r+2} \leq 0$. (Recall that
$1 \leq k \leq s - 1$. The sum in the l.h.s. is zero for all $r \neq k - 1$, and equals $-1$ for
$r = k - 1$.) Thus we have the theorem, from Theorem 1.
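The two facts used in this proof are easy to check mechanically for small parameters. The following sketch is illustrative only; the helper names are hypothetical, and $g_{\lambda,S}(x)$ is taken to be the entry of $\lambda$ indexed by the number of ones of $x$ inside $S$, with $S$ being all $s$ coordinates.

```python
from itertools import product


def capped_sum_weights(k, s):
    """The vector lam = (0, 1, ..., k, k, ..., k) of length s + 1."""
    return [min(k, l) for l in range(s + 1)]


def second_differences(lam):
    """lam[r] - 2 lam[r+1] + lam[r+2] for r = 0, ..., len(lam) - 3."""
    return [lam[r] - 2 * lam[r + 1] + lam[r + 2] for r in range(len(lam) - 2)]


def identity_holds(k, s):
    """Check min{k, Sum(x)} == lam[Sum(x)] for every x in {0,1}^s."""
    lam = capped_sum_weights(k, s)
    return all(min(k, sum(x)) == lam[sum(x)] for x in product((0, 1), repeat=s))


# Illustrative parameters (an assumption): k = 3 and s = 6.
lam = capped_sum_weights(3, 6)
diffs = second_differences(lam)
```

With $k = 3$ and $s = 6$, the only non-zero second difference sits at $r = k - 1 = 2$.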

Proof. (For Theorem 3) Let $z_i = y_i / k$. We prove by induction on $|S|$ that

\[ \mathbf{E}[\min\{k, \mathrm{Sum}_S(\mathcal{R}(y))\}] \geq k \Big( 1 - \prod_{i \in S} (1 - z_i) \Big). \tag{16} \]

This proves the theorem since the RHS above is at least $k(1 - \exp(-\sum_{i \in S} z_i)) =
k(1 - \exp(-\mathrm{Sum}_S(y)/k))$ (since $t \geq 1 - \exp(-t)$ for all real $t$).



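We now establish (16) by induction on $|S|$. The base case when $|S| = 1$ is
trivial. For notational simplicity, suppose that $1 \in S$. For $|S| \geq 2$, we have

\[ \begin{aligned}
\mathbf{E}[\min\{k, \mathrm{Sum}_S(\mathcal{R}(y))\}]
&= y_1 \big( 1 + \mathbf{E}[\min\{k - 1, \mathrm{Sum}_{S \setminus \{1\}}(\mathcal{R}(y))\}] \big) + (1 - y_1) \mathbf{E}[\min\{k, \mathrm{Sum}_{S \setminus \{1\}}(\mathcal{R}(y))\}] \\
&\geq y_1 + \mathbf{E}[\min\{k, \mathrm{Sum}_{S \setminus \{1\}}(\mathcal{R}(y))\}] \Big( y_1 \cdot \frac{k - 1}{k} + 1 - y_1 \Big) \\
&= y_1 + \Big( 1 - \frac{y_1}{k} \Big) \mathbf{E}[\min\{k, \mathrm{Sum}_{S \setminus \{1\}}(\mathcal{R}(y))\}] \\
&\geq k \Big( z_1 + (1 - z_1) \Big( 1 - \prod_{i \in S \setminus \{1\}} (1 - z_i) \Big) \Big) = k \Big( 1 - \prod_{i \in S} (1 - z_i) \Big).
\end{aligned} \]

Here the first inequality uses $\min\{k - 1, X\} \geq \frac{k-1}{k} \min\{k, X\}$ for every non-negative integer $X$, and the last inequality uses the induction hypothesis together with $y_1 = k z_1$.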
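Inequality (16) and the exponential bound that follows from it can be checked by brute force on small instances. The sketch below assumes that $\mathcal{R}(y)$ rounds each entry independently, as the conditioning on the first coordinate in the induction suggests; the function names and the sample vector are hypothetical.

```python
from itertools import product
from math import exp, prod


def expected_capped_sum(y, k):
    """E[min{k, X}] where X is a sum of independent Bernoulli(y_i) variables,
    computed by enumerating all 2^len(y) outcomes."""
    total = 0.0
    for outcome in product((0, 1), repeat=len(y)):
        probability = prod(p if bit else 1 - p for p, bit in zip(y, outcome))
        total += probability * min(k, sum(outcome))
    return total


def product_bound(y, k):
    """Right-hand side of (16): k * (1 - prod_i (1 - y_i / k))."""
    return k * (1 - prod(1 - p / k for p in y))


def exponential_bound(y, k):
    """The weaker bound k * (1 - exp(-sum(y) / k))."""
    return k * (1 - exp(-sum(y) / k))


# Illustrative instance (an assumption, not taken from the text).
y = [0.5, 0.7, 0.2, 0.9]
k = 2
lhs = expected_capped_sum(y, k)
mid = product_bound(y, k)
rhs = exponential_bound(y, k)
```

On this instance the three quantities should appear in non-increasing order, `lhs >= mid >= rhs`, matching the chain of inequalities in the proof.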

C Proof of a bound on the expected connection cost of a 
client 

Proof. (For Lemma 2) Given client $j$, fractional facility opening vector $y$,
distances $c_{ij}$, requirement $r_j$, and facility subsets $C_j$ and $D_j$, we will describe
how to randomly choose a subset of at most $k = r_j$ open facilities from $C_j \cup D_j$
with the desired properties.

Within this proof we will assume that all the involved numbers are rational. 
Recall that the opening of facilities is decided in a dependent rounding routine, 
that in a single step couples two fractional entries to leave at most one of them 
fractional. 

Observe that, for the purpose of this argument, we may split a single facility 
into many identical copies with smaller fractional opening. One may think that 
the input facilities and their original openings were obtained along the process 
of dependent rounding applied to the multiple "small" copies that we prefer to 
consider here. Therefore, without loss of generality, we may assume that all the
facilities have fractional opening equal to $\epsilon$, i.e., $y_i = \epsilon$ for all $i \in C_j \cup D_j$. Moreover,
we may assume that sets $C_j$ and $D_j$ are disjoint.

By renaming facilities we may obtain that $C_j = \{1, 2, \ldots, |C_j|\}$, $D_j = \{|C_j| +
1, \ldots, |C_j| + |D_j|\}$, and $c_{ij} \leq c_{i'j}$ for all $1 \leq i < i' \leq |C_j| + |D_j|$.

Consider random set $S_0 \subseteq C_j \cup D_j$ created as follows. Let $\hat{y}$ be the outcome of
rounding the fractional opening vector $y$ with the dependent rounding procedure,
and define $S_0 = \{ i : \hat{y}_i = 1, (\sum_{i' < i} \hat{y}_{i'}) < k \}$. By Corollary 1, we have that
$\mathbf{E}[|S_0|] \geq k \cdot (1 - \exp(-\mathrm{Sum}_{C_j \cup D_j}(y)/k))$. Define random set $S_\alpha$ for $\alpha \in (0, |C_j| +
|D_j|]$ as follows. For $i = 1, 2, \ldots, \lfloor |C_j| + |D_j| - \alpha \rfloor$ we have $i \in S_\alpha$ if and only
if $i \in S_0$. For $i = \lceil |C_j| + |D_j| - \alpha \rceil$, in case $i \in S_0$ we toss a (suitably biased)
coin and include $i$ in $S_\alpha$ with probability $\alpha - \lfloor \alpha \rfloor$. For $i > \lceil |C_j| + |D_j| - \alpha \rceil$ we
deterministically have $i \notin S_\alpha$.

Observe that $\mathbf{E}[|S_\alpha|]$ is a continuous monotone non-increasing function of $\alpha$,
therefore there exists $\alpha_0$ such that $\mathbf{E}[|S_{\alpha_0}|] = k \cdot (1 - \exp(-\mathrm{Sum}_{C_j \cup D_j}(y)/k))$. We
fix $F_j = S_{\alpha_0}$ and claim that it has the desired properties. Clearly, by definition,
we have $\mathbf{E}[|F_j|] = k \cdot (1 - \exp(-\mathrm{Sum}_{C_j \cup D_j}(y)/k)) = (1 - \frac{1}{e^{\gamma}}) \cdot r_j$. We next show
that the expected total connection cost between $j$ and facilities in $F_j$ is not too
large.



Let $p_i^{\alpha} = \Pr[i \in S_\alpha]$ and $p'_i = p_i^{\alpha_0} = \Pr[i \in F_j]$. Consider the cumulative
probability defined as $cp_i^{\alpha} = \sum_{i' \leq i} p_{i'}^{\alpha}$. Observe that application of Corollary 1
to subsets of first $i$ elements of $C_j \cup D_j$ yields $cp_i^{0} \geq k \cdot (1 - \exp(-\epsilon i / k))$ for
$i = 1, \ldots, |C_j| + |D_j|$. Since $(1 - \exp(-\epsilon i / k))$ is a monotone increasing function
of $i$ one easily gets that also $cp_i^{\alpha} \geq k \cdot (1 - \exp(-\epsilon i / k))$ for $\alpha \leq \alpha_0$ and $i =
1, \ldots, |C_j| + |D_j|$. In particular, we get $cp_{|C_j|}^{\alpha_0} \geq k \cdot (1 - \exp(-\epsilon |C_j| / k))$.

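Since $(1 - \exp(-\epsilon i / k))$ is a concave function of $i$, we also have

\[ \begin{aligned}
cp_i^{\alpha_0} &\geq k \cdot (1 - \exp(-\epsilon i / k)) \\
&\geq (i / |C_j|) \cdot k \cdot (1 - \exp(-\epsilon |C_j| / k)) \\
&= (i / |C_j|) \cdot (1 - \tfrac{1}{e}) \cdot r_j
\end{aligned} \]

for all $1 \leq i \leq |C_j|$. Analogously, we get

\[ \begin{aligned}
cp_i^{\alpha_0} &\geq k \cdot (1 - \exp(-\epsilon |C_j| / k)) \\
&\quad + ((i - |C_j|) / |D_j|) \cdot k \cdot \big( (1 - \exp(-\epsilon (|C_j| + |D_j|) / k)) - (1 - \exp(-\epsilon |C_j| / k)) \big) \\
&= r_j \cdot (1 - \tfrac{1}{e}) + r_j \cdot \big( ((i - |C_j|) / |D_j|) ((1 - \tfrac{1}{e^{\gamma}}) - (1 - \tfrac{1}{e})) \big)
\end{aligned} \]

for all $|C_j| < i \leq |C_j| + |D_j|$.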

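Recall that we want to bound $\mathbf{E}[\sum_{i \in F_j} c_{ij}] = \sum_{i \in C_j \cup D_j} p'_i c_{ij}$, where $p'_i = \Pr[i \in F_j]$. From the above
bounds on the cumulative probability, we get that, by shifting the probability
from earlier facilities to later ones, one may obtain a probability vector $p''$ with
$p''_i = 1/|C_j| \cdot ((1 - \frac{1}{e}) \cdot r_j)$ for all $1 \leq i \leq |C_j|$, and $p''_i = 1/|D_j| \cdot (((1 - \frac{1}{e^{\gamma}}) - (1 - \frac{1}{e})) \cdot r_j)$
for all $|C_j| < i \leq |C_j| + |D_j|$. Since connection costs are monotone
non-decreasing in $i$, when shifting the probability one never decreases the weighted
sum, therefore

\[ \begin{aligned}
\mathbf{E}\Big[\sum_{i \in F_j} c_{ij}\Big] &= \sum_{i \in C_j \cup D_j} p'_i c_{ij} \leq \sum_{i \in C_j \cup D_j} p''_i c_{ij} \\
&= \sum_{1 \leq i \leq |C_j|} 1/|C_j| \cdot ((1 - \tfrac{1}{e}) \cdot r_j) c_{ij} \\
&\quad + \sum_{|C_j| < i \leq |C_j| + |D_j|} 1/|D_j| \cdot (((1 - \tfrac{1}{e^{\gamma}}) - (1 - \tfrac{1}{e})) \cdot r_j) c_{ij} \\
&= ((1 - 1/e) \cdot r_j) \cdot d_j^{(c)} + (((1 - \tfrac{1}{e^{\gamma}}) - (1 - 1/e)) \cdot r_j) \cdot d_j^{(d)}.
\end{aligned} \]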



